Digital Library

2 results for RELATEDNESS

in Open University Netherlands

Combining Taxonomies using Word2vec

Relevance:

10.00%

Abstract:

Taxonomies have gained broad usage in a variety of fields due to their extensibility, as well as their use for classification and knowledge organization. Of particular interest is the digital document management domain, in which their hierarchical structure can be used to organize documents into content-specific categories. Common or standard taxonomies (e.g., the ACM Computing Classification System) contain concepts that are too general for conceptualizing specific knowledge domains. In this paper we introduce a novel automated approach that combines sub-trees from general taxonomies with specialized seed taxonomies by using specific Natural Language Processing techniques. We provide an extensible and generalizable model for combining taxonomies in the practical context of two very large European research projects. Because the manual combination of taxonomies by domain experts is a highly time-consuming task, our model automates it by measuring the semantic relatedness between concept labels in CBOW or skip-gram Word2vec vector spaces. A preliminary quantitative evaluation of the resulting taxonomies is performed after applying a greedy algorithm with incremental thresholds used for matching and combining topic labels.
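As a rough illustration of the matching step mentioned in the abstract, the sketch below attaches labels from a general taxonomy to seed labels by cosine similarity, relaxing the threshold in steps. It is not the authors' implementation: the toy vectors stand in for trained Word2vec embeddings, and the averaging rule, the threshold values, and the names `label_vector` and `greedy_match` are all assumptions made for the example.

```python
import math

# Toy 3-d vectors standing in for trained Word2vec embeddings (assumption).
TOY_VECTORS = {
    "machine": [0.9, 0.1, 0.0],
    "learning": [0.8, 0.2, 0.1],
    "statistical": [0.7, 0.3, 0.0],
    "inference": [0.75, 0.25, 0.1],
    "databases": [0.0, 0.9, 0.2],
    "storage": [0.1, 0.8, 0.3],
    "graphics": [0.1, 0.1, 0.9],
}


def label_vector(label):
    """Average the vectors of the known words in a label (hypothetical rule)."""
    words = [w for w in label.lower().split() if w in TOY_VECTORS]
    if not words:
        return None
    dim = len(TOY_VECTORS[words[0]])
    return [sum(TOY_VECTORS[w][i] for w in words) / len(words) for i in range(dim)]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def greedy_match(seed_labels, general_labels, thresholds=(0.99, 0.95, 0.90)):
    """Match each general label to its most related seed label.

    Thresholds are tried from strictest to loosest; each match records the
    threshold at which it was accepted. Labels below the last threshold
    stay unmatched.
    """
    matches = {}
    for threshold in thresholds:
        for general in general_labels:
            if general in matches:
                continue
            general_vec = label_vector(general)
            if general_vec is None:
                continue
            best_seed, best_score = None, threshold
            for seed in seed_labels:
                seed_vec = label_vector(seed)
                if seed_vec is None:
                    continue
                score = cosine(general_vec, seed_vec)
                if score >= best_score:
                    best_seed, best_score = seed, score
            if best_seed is not None:
                matches[general] = (best_seed, threshold)
    return matches


matches = greedy_match(
    ["machine learning", "databases"],
    ["statistical inference", "storage", "graphics"],
)
```

Recording the accepting threshold lets a reviewer inspect the strictest matches first. With real embeddings, the toy dictionary would be replaced by lookups in a trained CBOW or skip-gram model.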


Finding the Needle in a Haystack: Who are the Most Central Authors Within a Domain?

Relevance:

10.00%

Abstract:

The speed at which new scientific papers are published has increased dramatically, while the process of tracking the most recent high-impact publications has become more and more cumbersome. In order to support learners and researchers in retrieving relevant articles and identifying the most central researchers within a domain, we propose a novel 2-mode multilayered graph derived from Cohesion Network Analysis (CNA). The resulting extended CNA graph integrates both authors and papers, as well as three principal link types: coauthorship, co-citation, and semantic similarity among the contents of the papers. Our rankings do not rely on the number of published documents, but on their global impact based on links between authors, citations, and semantic relatedness to similar articles. As a preliminary validation, we have built a network based on the 2013 LAK dataset in order to reveal the most central authors within the emerging Learning Analytics domain.
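To make the graph structure described in the abstract more concrete, the sketch below builds a tiny 2-mode graph of authors and papers with the three link types and ranks authors by link strength. It is only an illustration: the edge data is invented, the explicit `authorship` edges and the additive scoring rule are assumptions, and the abstract does not say which centrality measure the paper actually uses.

```python
from collections import defaultdict

# (node_a, node_b, link_type, weight); all values are invented for the example.
# "authorship" edges are assumed to be listed as (author, paper).
EDGES = [
    ("author:A", "author:B", "coauthorship", 1.0),
    ("author:A", "paper:P1", "authorship", 1.0),
    ("author:B", "paper:P1", "authorship", 1.0),
    ("author:C", "paper:P2", "authorship", 1.0),
    ("author:A", "paper:P3", "authorship", 1.0),
    ("author:D", "paper:P4", "authorship", 1.0),
    ("paper:P1", "paper:P2", "co-citation", 1.0),
    ("paper:P1", "paper:P3", "semantic", 0.8),
    ("paper:P2", "paper:P3", "semantic", 0.4),
]


def rank_authors(edges):
    """Rank authors by coauthorship strength plus the link strength of their papers.

    A paper with no co-citation or semantic links adds nothing, so the
    score reflects how connected the papers are, not how many there are.
    """
    impact = defaultdict(float)   # per-paper co-citation + semantic weight
    social = defaultdict(float)   # per-author coauthorship weight
    papers_of = defaultdict(set)  # author -> papers written
    for a, b, link_type, weight in edges:
        if link_type == "authorship":
            papers_of[a].add(b)
        elif link_type == "coauthorship":
            social[a] += weight
            social[b] += weight
        else:
            impact[a] += weight
            impact[b] += weight
    authors = set(papers_of) | set(social)
    scores = {
        author: social[author] + sum(impact[p] for p in papers_of[author])
        for author in authors
    }
    # Highest score first; ties broken alphabetically for a stable order.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = rank_authors(EDGES)
```

In this toy data, author D has one paper with no links and therefore ends up last. A fuller treatment would likely use a graph library and a centrality measure suited to multilayer networks.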

